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Abstract 

Background: Influenza infection causes respiratory disease that can lead to death. The complex interplay between 
virus-encoded and host-specific pathogenicity regulators - and the relative contributions of each toward viral 
pathogenicity - is not well-understood. 

Results: By analyzing a collection of lung samples from mice infected by A/Vietnam/1 203/2004 (H5N1; VN1203), we 
characterized a signature of transcripts and proteins associated with the kinetics of the host response. Using a new 
geometrical representation method and two criteria, we show that inoculation concentrations and four specific 
mutations in VN1203 mainly impact the magnitude and velocity of the host response kinetics, rather than specific 
sets of up- and down- regulated genes. We observed analogous kinetic effects using lung samples from mice 
infected with A/California/04/2009 (H1N1), and we show that these effects correlate with morbidity and viral titer. 

Conclusions: We have demonstrated the importance of the kinetics of the host response to H5N1 pathogenesis and 
its relationship with clinical disease severity and virus replication. These kinetic properties imply that time-matched 
comparisons of 'omics profiles to viral infections give limited views to differentiate host-responses. Moreover, these 
results demonstrate that a fast activation of the host-response at the earliest time points post-infection is critical for 
protective mechanisms against fast replicating viruses. 

Keywords: Influenza, Host, Response, Kinetics, Magnitude, Velocity, Transcriptomics, Proteomics, Multidimensional, 
Scaling 



Background 

Highly pathogenic H5N1 avian influenza (HPAI) viruses 
cause rare but severe disease in humans, with a case fa- 
tality rate approaching 60% among laboratory-confirmed 
cases. Although direct human-to-human transmission of 
these viruses does not occur, recent evidence suggests 
that only a few molecular changes in the viral surface 
hemagglutinin (HA) protein are sufficient to convert a 
non-transmissible HPAI virus into one capable of drop- 
let transmission in the ferret model [1-3]. These findings 
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raise concern over the possibility of a HPAI virus pan- 
demic, and underscore the critical need for a better un- 
derstanding of the mechanisms that control HPAI virus 
pathogenicity. Furthermore, the molecular dynamics 
of the host response and the complex interplay be- 
tween virus-encoded determinants, host regulatory fac- 
tors, H5N1 pathogenesis and severe lung disease is not 
well understood. Such information is essential for the de- 
velopment of more effective intervention strategies aimed 
at ameliorating human disease and loss of human life 
resulting from HPAI virus infections. 

To unravel this interplay, we performed a systematic 
characterization of the global host response at the tran- 
script and protein levels in lungs of mice infected by a 
panel of viruses that vary in pathogenicity, using a range 
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of inoculation dosages. We also report a systematic collec- 
tion of several corresponding phenotypic variables, includ- 
ing mouse body weights, lung virus titers, viral messenger 
RNA, and viral genomic RNA for each infected animal 
Our panel of viruses included the A/ Vietnam/ 1203/2004 
[H5N1] wild-type virus (VN1203-WT), the lower patho- 
genicity A/California/04/2009 [H1N1] wild-type virus 
(CA04-WT), two previously described VN1203 mutant 
viruses - VN1203-HAavir [2] and VN1203-PB2-627E [4] - 
and two newly generated VN1203 mutant viruses - 
VN1203-NSltrunc and H5N1 VN1203-PBlF2deL 

The H5N1 VN1203-WT virus is extremely virulent in 
mice [5], and elicits a host response that contributes in 
part to its pathogenicity [6]. The H5N1 VN1203-HAavir 
mutant virus harbors an altered multi-basic cleavage 
site - a virulence factor important for expanded tissue 
range [7-9] - and exhibits restricted systemic viral spread 
due to limited HA susceptibility to furin protease activity. 
The H5N1 VN1203-PB2627E mutant possesses an amino 
acid substitution (Lys-to-Glu) at position 627 in the PB2 
polymerase subunit. This mutation is known to confer 
increased polymerase activity in mammalian cells [10], 
and also modulates anti-viral activity, apoptosis, and viral 
clearance [11]. Our newly generated H5N1 VN1203- 
NSltrunc mutant virus produces a 91 amino acid C- 
terminal truncation in the effector domain of the 
NS1 host response antagonist protein. The NS1 pro- 
tein inhibits RIG -I activation [12] and cellular mRNA 
processing [13], and also promotes PI3K activation [13]. 
The truncation results in the loss of the NS1 nuclear 
localization signal, a PI3K-binding motif, and binding do- 
mains that support interactions with the cellular nuclear 
proteins CPSF and poly (A) -binding protein II (PABII) - 
two factors that function in the 3'-end-processing of cel- 
lular pre-mRNAs [14,15]. The newly generated H5N1 
VN1203-PBlF2del mutant lacks expression of the PB1- 
F2 protein, potentially impacting an array of functions. 
PB1-F2 is a viral pathogenicity factor in mammals and 
birds [16], and has been shown to modulate viral poly- 
merase activity [17,18], enhance lung inflammation [19], 
modulate innate immune responses [20,21], and demon- 
strate pro-apoptotic activity [14]. Finally, the CA04-WT 
virus, which is an H1N1 isolate from the 2009 pandemic 
season, induces lower pathogenicity in mice relative to 
VN1203 [5,22]. 

The extensive systematic analysis of transcriptomic 
and proteomic pulmonary responses to wild-type and 
mutant viruses we report here provides an unprece- 
dented opportunity to assess the effect of specific virus 
attenuating mutations on global host responses in vivo, 
as well as the opportunity to directly examine the effects 
of dosage on the elicited host response. Here, we sought 
to specifically address how the kinetics of the host re- 
sponses to the different viral infections differs, and how 



this kinetics is related to the outcome of infection. To 
address this, we used a systems biology approach to 
analyze this dataset, and we developed a new geomet- 
rical representation method and two criteria - the mag- 
nitude and velocity coefficients - to visualize and 
quantify the kinetics of the host response to the different 
viral infections. Using this approach, we have established 
that it is the magnitude and velocity of the early host 
response, rather than engagement of specific biological 
pathways per se, which mainly contributes to the ob- 
served pathogenicity of influenza viruses. Importantly, 
we show that the molecular kinetic of the host re- 
sponse was associated with clinical disease severity and 
virus replication. 

Results 

Assembly of phenotypic variables and lung transcriptomic 
and proteomic profiles from mice infected with low and 
high pathogenicity influenza viruses 

We collected 230 transcriptomic and 198 proteomic pro- 
files from the lungs of 20-week old C57BL/6 mice infected 
with CA04-WT, VN1203-WT, VN1203-HAavir, VN1203- 
PB2627E, VN1203-NSltrunc, or VN1203-PBlF2del. Mice 
were infected with a range of inoculation dosages (from 
10 2 to 10 6 Plaque Forming Unit - PFU) and lungs were 
harvested at 1, 2, 4 and 7 days post-infection (dpi) to meas- 
ure gene expression using oligonucleotide arrays and ob- 
tain proteomic profiles by liquid chromatography-mass 
spectrometry (LC-MS). Time-matched lung samples from 
mock-infected mice were collected and a total number of 
300 transcriptomic profiles and 266 proteomic profiles are 
represented in our dataset. 

Throughout this article, we use the following defini- 
tions to describe the different conditions: (i) biological 
condition refers to all samples infected by the same 
virus, at the same inoculation dose, and from the same 
dpi; (ii) dosage condition refers to samples infected by the 
same virus and at the same inoculation dose; (iii) strain 
condition refers to all samples infected by the same virus. 
Based on these criteria, the dataset is divided into 51 
transcriptomic biological conditions and 42 proteomic 
biological conditions, representing 13 transcriptomic and 
11 proteomic dosage conditions, and 6 strain conditions. 
Additional file 1: Figure SI provides a schematic repre- 
sentation of the collected infected omics profiles across 
the time course. 

The different viruses can be ranked based on their 
pathogenicity as determined by Median Lethal Dose 
(MLD) values in 6-week-old BALB/c mice (Figure 1A). 
The VN1203-WT (MLD < 1 PFU) and the VN1203- 
PBlF2del (MLD = 3.2 PFU) viruses were associated 
with the highest level of pathogenicity. The CA04-WT 
(MLD = 630,957 PFU; previously determined [22]) and the 
VN1203-HAavir (MLD = 316,228 PFU) viruses were 
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Figure 1 MLD and phenotypical variables collected for all viruses at the 10 4 PFU infection dosage. (A) Median lethal dose (MLD) values of 
the different viruses in 6-week-old BALB/c mice. MLD values for VN1203 and mutants were performed in one experiment (3 mice per inoculation 
dosage), and the previously calculated MLD for CA04 is included for comparison [22]. (B) Mice body weight measurements collected each day 
from 1 to 7 days post-infection for the 10 4 PFU infection dosages. For each virus and time point, the mean and the standard deviation of the 
mice body weight measurements are indicated by a vertical bar. (C), (D) and (E) Viral titer, viral messenger RNA, and viral genomic RNA 
measurements collected at 1, 2, 4 and 7 days post-infection for the 10 4 PFU infection dosages. For each virus and time point the set of individual 
measurements are indicated by filled dots and the mean amongst the individual samples is indicated by a horizontal bar. For the experiments 
shown in panels B-E, four to five mice were used for all VN1203-W and pathogenicity mutant infections, while 3-4 mice were used for the CA04 
infections. The same animals were used to derive all the phenotypical data (i.e. weight loss, virus titer and virus mRNA/genomic RNA levels). 
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associated with the lowest level of pathogenicity. The 
VN1203-NSltrunc (MLD = 631 PFU) and the VN1203- 
PB2627E (MLD = 6,310 PFU) viruses showed an inter- 
mediate level of pathogenicity. 

Morbidity associated with infection, determined by 
mouse body weight loss (Figure IB), viral replication quan- 
tified by viral titration (Figure 1C), viral messenger RNA 
(Figure ID), and viral genomic RNA (Figure IE) measured 
by quantitative RT-PCR were used to characterize the 
viruses in our study. These data were collected in the 
same experiment in which mouse lung tissues were 
collected for global transcriptome and proteome profil- 
ing. Over the 7-day time course, VN1203-WT and all of 
the mutant viruses - with the exception of VN1203- 
PB2627E - exhibited similar, rapid weight loss culminat- 
ing in a 20-30% reduction from initial body weights by 
5-7 days post-infection (Figure IB). In contrast, both 
VN1203-PB2627E and CA04-WT displayed a more re- 
duced rate of weight loss, which peaked at -17% on day 
7 post-infection. None of the mice survived to day 7 for 
the VN1203-WT infection at 10 4 PFU. Virus titers were 
roughly distributed from low to high according to virus 
pathogenicity, with CA04-WT exhibiting the lowest 
mean titer in all four time-points and VN1203-WT 
exhibiting the highest (Figure 1C; see Additional file 2: 
Figure S2 for statistical comparisons). The one exception 
was titers observed in VN1203-HAavir infections, which 
rivaled that of VN1203-WT in all time-points. Both viral 
messenger RNA and genomic RNA production levels 
exhibited early (day 1) segregation into two groups, 
with VN1203-PB2627E and WT-CA04 displaying a 
1-1.5 log 10 -fold reduction in expression compared to 
all the other viruses (Figure ID and E and Additional file 2: 
Figure S2). Viral RNA expression levels remained partially 
segregated on day 2, and by day 4, expression levels were 
more similar for all viruses in the panel. 

In brief, influenza viruses used in this study exhibited 
a range of lethality, morbidity, and replication in mice. 
Consistent with our expectations, the VN1203-WT virus 
was the most virulent, with a very low MLD, the most 
rapid weight loss, and the highest virus replication at 
early dpi (day 1 and 2). 

Gene magnitude changes significantly distinguish 
the strain and biological conditions 

Deletion of PB1-F2, K627E substitution in PB2, elimin- 
ation of the HA multi-basic cleavage site, and truncation 
of NS1 all decreased VN1203 virulence in mice, and 
CA04-WT shows the lowest virulence in mice. We first 
aimed to determine whether the attenuated phenotype 
for each virus was the result of a different host response. 

For each biological condition, differentially expressed 
transcripts (DET) and proteins (DEP) compared to the 
mock-infected samples were identified and the overlap 



between the lists of DET or DEP of each virus were 
compared. Figure 2A and B provide visual representa- 
tions of the intersections between the lists of DET 
and DEP for the strain conditions using proportional 
Euler-diagrams [23]. There were large degrees of over- 
lap (defined here as the percentage of DET or DEP 
also found in another condition) between the strain con- 
ditions, between 69.41% - 99.62% and 85.94% - 96.98%, 
DET and DEP, respectively. The VN1203-PBlF2del strain 
condition had the smallest degree of overlap at the 
transcriptomic (69.41%) level, and a relatively small de- 
gree at the proteomic (87.95%) levels. Notably, at 7 dpi, 
only one transcriptomic profile was available for the 
VN1203-PBlF2del strain condition inoculated at the 10 4 
PFU concentration. Hence, with only one sample avail- 
able, the statistical identification of DET lead to the iden- 
tification of a fraction of false-positive, explaining this 
lowest degree of overlap observed for this strain condi- 
tion at the transcriptomic level. The VN1203-WT strain 
condition also showed a relatively small degree of overlap 
at the transcriptomic (85.55%) level; and the VN1203- 
NSltrunc strain condition showed the smallest degree 
of overlap at the proteomic (85.94%) level compared 
to the other strain conditions. It should be noted that 
the strain conditions with the lowest degrees of over- 
lap corresponded to the same strain conditions that 
triggered the largest amount of DET and DEP. Similar 
large amounts of overlap have been identified between 
the dosage conditions, with degrees of overlap ranging 
from 68% to 98.81% for the transcriptome and from 
89.87% to 99.89% for the proteome. Despite differences 
in clinical manifestation of the disease, these results sug- 
gest that the specific mutations in VN1203 examined 
herein resulted in the induction of similar groups of tran- 
scripts and proteins compared to VN1203-WT, implying 
that magnitude and/or the kinetics of dysregulation of 
these overlapping genes might differentiate the viruses. 
Figure 2C is a heatmap of the transcript expression 
values, ratioed to the mock-infected samples, of each 
infected sample of our dataset. This heatmap has been re- 
stricted to the lists of transcripts specific to each strain 
condition (i.e. transcripts found as differentially expressed 
within one viral strain but not in the others). We cannot 
distinguish any particular patterns or sets of transcript spe- 
cific to any viral strain based on these transcript expression 
values. The transcript or protein subsets specific to each 
viral strain are the consequences of small variations in the 
host response, detected by the statistical procedures, that 
have no specific biological meaning. 

Functional enrichment analysis of the biological condi- 
tions also indicated a large degree of overlap in the host 
response - at least qualitatively - irrespective of the virus 
used to infect the mice. For instance, Acute Phase Re- 
sponse Signaling, Antigen Presentation Pathway, LXR/RXR 
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Figure 2 (See legend on next page.) 
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(See figure on previous page.) 

Figure 2 Intersections between the lists of DET and DEP for the strain conditions. (A) and (B) Proportional Euler-diagrams showing the 
intersections between the lists of differentially expressed transcripts and proteins identified for the strain conditions. Proportional Euler-diagrams 
visually represent the cardinalities of sets and intersection sets of differentially expressed transcripts or proteins by area-proportional circle 
graphics. Each list of DE transcripts or proteins is then represented by a circle with a diameter proportional to the cardinality of the list and the 
overlaps between the circles are proportional to the cardinality of the intersections between the lists. For each strain condition the number of 
transcripts or proteins found as DE in the host response is indicated as well as the degree of overlap — quantified as the percentage of 
transcripts or proteins also found as differentially expressed in another condition. (C) Heatmap of the transcriptomic expression values, ratioed to 
mock-infected samples, for each infected sample. The heatmaps have been restricted to the lists of transcripts found as specific for each viral 
strain and the different subsets of transcripts specific are indicated. For each set of specific transcripts, hierarchical clustering have been 
performed and represented using dendrograms. Biological samples have been ordered by strain conditions, sorted by inoculation concentrations 
and then by increasing days post-infection. (D) Heatmaps showing the statistical over-representation of the canonical pathways based on the lists 
of transcripts found as differentially expressed (compared to the mock-infected conditions) for each of the 51 biological conditions. This heatmap 
has been restricted to only display the top canonical pathways over-represented across all the dataset. 

V J 



Activation, and Protein Ubiquitination were pathways uni- 
versally enriched (Figure 2D). Differences between the bio- 
logical conditions were mainly explained by different 
degrees of enrichment scores, smaller for the early time- 
points post-infection and larger for the late time-points 
post-infection. Similarly, within each viral strain, the 
enrichment scores were smaller for the lower infection 
concentrations and larger for the higher infection concen- 
trations. Overall, the qualitative similarity in the host 
transcriptomic and proteomic response to various influ- 
enza viruses suggested that the different viruses stimulate 
activation of common host response signaling mechanisms 
with different degrees of magnitude. 

Altogether, these results demonstrate that it is mainly 
the magnitude of expression changes within the same 
transcripts or proteins, rather than the induction of dif- 
ferent gene sets, that differentiate the host response to 
CA04-WT, VN1203-WT, and H5N1 VN1203 mutants. 
Thus, isogenic viruses and different inoculation dosages 
likely trigger the same differentially regulated genes, 
but with different degrees of magnitude at both the 
transcriptomic and proteomic levels. This analysis was 
performed on the transcriptomic and proteomic levels 
independently, and in parallel. Though we noted this 
magnitude effect was stronger in the proteomic dataset 
compared to the transcriptomic dataset, we found the 
same conclusions in each separate analysis. 

The kinetics of the host response to VN1203-WT is captured 
by 3 main eigentranscripts and 3 main eigenproteins 

To better characterize the host responses, and to be able to 
quantify their kinetics, we used a Singular Value Decom- 
position analysis to capture and cluster transcripts and pro- 
teins associated with the system s dynamics, as described in 
[24]. Transcripts and proteins previously identified as dif- 
ferentially expressed in at least one biological condition 
that correlated with the eigentranscripts or eigenproteins 
(right-singular vectors of the singular value decomposition) 
were identified using the Pearsons coefficient, with 
eigentranscripts or eigenproteins representing significant 



patterns of expression profiles, associated with the kinetics 
of the system. Transcripts and proteins that did not correl- 
ate with any eigentranscript or eigenprotein, respectively, 
were not considered as significantly associated in the dy- 
namics of the host response (Pearsons coefficient cutoff = 
0.65). This approach was used to examine the host 
response to VN1203-WT, independently of the inoculation 
dosage, that was then subsequently used as a reference to 
compare the VN1203 mutants and CA04-WT. Mock 
infected biological conditions were not used in the infer- 
ence of eigentranscripts and eigenproteins, and biological 
conditions at 1 dpi and 2 dpi were not used in the infer- 
ence of the eigenproteins because of the relatively low 
levels of proteins detected as differentially expressed at 
these time-points. 

We found a total of 5,660 transcripts that correlated with 
3 main eigentranscripts (eigentranscript #1 contained 
2,706 transcripts, eigentranscript #2 contained 2,826 
transcripts, and eigentranscript #3 contained 128 tran- 
scripts). There were a total of 162 proteins that correlated 
with 3 main eigenproteins (eigenprotein #1 contained 
59 proteins, eigenprotein #2 contained 86 proteins, 
and eigenprotein #3 contained 17 proteins). Additional 
file 3: Table SI and Additional file 4: Table S2 provide 
the lists of transcripts and proteins associated with each 
eigentranscript and eigenprotein, respectively. The pat- 
terns of the eigentranscripts and eigenproteins identified 
in the VN1203-WT host response are represented in 
Figure 3A and B. Eigentranscript #2 was increasing over 
time, whereas eigentranscript #1 was decreasing over 
time. Eigentranscript #3 showed a distinct pattern with a 
transient decrease at 2 and 4 dpi, followed by an increase 
at 7 dpi. Eigenprotein #1 was increased in abundance 
over time. In brief, the dynamics of the transcriptomic 
and proteomic host response were captured by 3 sets of 
co-expressed transcripts and 3 sets of co-expressed pro- 
teins with different patterns of variations. 

For each set of transcripts and proteins, we identi- 
fied over-represented biological functions, canonical 
pathways, and upstream regulators (Tables 1 and 2). 
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Figure 3 Eigengenes in the kinetics of the VN1203-WT response, and MDS representations of the infected samples. (A) and (B) Profiles 
of the eigentranscripts and eigenproteins identified in dynamic of the host response to H5N1 VN1203 wild-type. Number of transcripts and 
proteins correlating with each eigentranscript and eigenprotein are indicated. All the individual shape represent the transcriptomic or proteomic 
profile of a mice lung infected by the VN1203 wild-type virus. Biological samples have been sorted by inoculation concentrations and then by 
increasing days post-infection. (C) and (D) Multidimensional Scaling representations of the transcriptomic and proteomic profiles of the H5N1 
VN1203 wild-type infected samples. Each dot in the representations is the transcriptomic or proteomic profile of a biological sample plotted in 
the intensity space of expression signals. Pairwise distances between the dots are proportional to the transcriptomic or proteomic distances 
between the samples. Transcriptomic and proteomic distances have been calculated based on the signature of 5,660 transcripts and 162 proteins 
that significantly correlate with one eigentranscript or eigenprotein. Dots are colored in order to indicate the dosage conditions, and biological 
conditions are indicated by the convex hull of the set of biological replicates and labeled to indicate the time point post-infection. The Kruskal 
Stress shown in each representation quantifies the quality of the geometrical representation as a fraction of the information lost during the 
dimensionality reduction procedure. Schematic projections of the Magnitude and Velocity Coefficients are illustrated at 10 4 PFU in the 
transcriptomic MDS representation. The Magnitude Coefficient MC at 2 dpi - quantifying the transcriptomic distance between mocks and 2 dpi 
samples - is illustrated by a dashed red line, and the Velocity Coefficient (VC) at 4 dpi - measuring the velocity between 4 and 2 dpi samples 
divided by time - is illustrated by an arrow dashed red line. 
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Table 1 Functional enrichment of the eigentranscript identified in the host response kinetics to VN1203 



Eigentranscript 
(No. of molecules) 



Bio functions category 

(functions annotation p-value range) 



Canonical pathway (p-value) 



Upstream regulator 
(p-value of overlap) 



eigentranscript #2 (2826) 



eigentranscript #7 (2706) Embryonic development (5.55E-1 1-5.24E-03) 

Organ development (5.55E-1 1-5.24E-03) 

Organismal development (5.55E-1 1-5.24E-03) 

Tissue development (5.55E-1 1-5.24E-03) 

Respiratory system development and 
function (1.19E-10-2.38E-03) 

Cellular function and maintenance 
(3.38E-55-2.01E-09) 

Cellular function and maintenance 
(3.38E-55-2.01E-09) 

Hematological system development and 
function (1 .34E-5 1-3.3 1 E-09) 

Tissue morphology (5.34E-5 1 -2.48E-1 0) 

Cellular growth and proliferation 
(5.55E-1 1-5.24E-03) 

eigentranscript #3 (1 28) Cell cycle (1 .67E-1 1-1 .60E-02) 

Cell cycle (1.67E-1 1-1.60E-02) 

Cellular development (5.83E-1 0-1 .60E-02) 

Hematological system development and 
function (5.83E-1 0-1 .60E-02) 

Hematopoiesis (5.83E-1 0-1 .60E-02) 



Axonal guidance signaling (8.71 E-07) NKX2-1 (5.35E-13) 

Basal cell carcinoma signaling (1.82E-05) PPP3R1 (2.10E-11) 

Glutathione-mediated detoxification (6.61 E-05) DMD (1 36E-05) 

Xenobiotic metabolism signaling (2.19E-04) BMP6 (6.45E-05) 

Tryptophan degradation (4.68E-04) GLI1 (1.38E-04) 

Death receptor signaling (5.01 E-09) LPS* (9.74E-1 1 1 ) 

Apoptosis signaling (7.94E-09) IFNG (2.42E-87) 

Role of pattern recognition receptors in TNF (8.84E-65) 
recognition of bacteria and viruses (1.78E-08) 

Induction of apoptosis by HIV1 (4.07E-08) IL1B (1.15E-56) 

Protein ubiquitination pathway (4.17E-08) TP53 (6.72E-55) 

Pyrimidine deoxyribonucleotides de novo E2F4 (5.42E-12) 
biosynthesis I (5.13E-05) 

TCell receptor signaling (1.70E-04) FOXM1 (1.84E-11) 

iCOS-iCOSL signaling in T helper cells (2.69E-04) CDK4 (1 .44E-1 0) 

Mitotic roles of polo-like kinase (4.47E-04) CDKN1 A (1.09E-09) 

Regulation of IL-2 expression in activated and CCND1 (1.66E-09) 
anergic T lymphocytes (8.51 E-04) 



For each eigentranscript identified in the kinetics of the host response to H5N1 VN1203 wild-type, the number of correlating transcripts is indicated. For each set 
of transcript the top 5 associated over-represented biological functions, canonical pathways, and upstream regulators are indicated. Ingenuity pathway analysis 
was used to determine the top 5 bio function categories, canonical pathways, and upstream regulators. The functions annotation p-value range represents the 
range of p-values for the functions annotations associated with each bio function category. The p-value of overlap associated for each upstream regulator 
indicates the significance of the overlap between the genes targeted by the upstream regulator in the IPKB database and the experimental dataset. ^Upstream 
regulators with an apteryx signify a chemical reagent or chemical drug. 



Eigentranscript #2 profile was enriched for transcripts 
associated with Death Receptor Signaling, Role of Pat- 
tern Recognition Receptors in Recognition of Bacteria and 
Viruses, and Apoptosis Signaling, as well as the Protein 
Ubiquitination Pathway, which has previously been ob- 
served in mouse lung responses following pandemic 1918 
influenza virus infection [25]. There was representation 
of transcripts encoding immuno proteasome compo- 
nents (e.g., PSMB9 and PSMB10), ubiquitin-protein 
conjugates (e.g., USP18 and USP3), and heat shock 
proteins (e.g., HSPA5 and HSPB7) accompanying the 
Protein Ubiquitination Pathway. Eigentranscript #1 profile 
was comprised of transcripts predominantly associated 
with metabolic cellular processes including Glutathione- 
mediated Detoxification (e.g., GSTK1 and GSTA3) and 
Xenobiotic Metabolism Signaling (e.g., AHR and CAT). Re- 
spiratory System Development and Function was an 
enriched biological function for eigentranscript #1 that in- 
cluded RARG and RARB transcripts, two ligand-dependent 
nuclear receptors that modulate an array of cellular re- 
sponses related to cellular growth. Eigentranscript #3 
profile was enriched for transcripts associated with T cell 



responses, CD3G, ZAP70, and ICOS, related to T Cell 
Receptor Signaling, iCOS-iCOSL Signaling in T Helper 
Cells, and Regulation of IL-2 Expression in Activated 
and Anergic T Lymphocytes. Activated CD8 T cells in 
the lungs of mice infected with H5N1 virus are impaired 
in their ability to control viral replication linked to high 
pathogenicity [26]. 

Eigenprotein #2 profile had strong representation 
of proteins associated with Acute Phase Response 
Signaling (e.g., SAA2 and SERPINA1), LXR/RXR Acti- 
vation (e.g., APOA1 and APOA2), Complement System 
(e.g., CFB and C4B) and Coagulation System (e.g., F2 
and FGG), cellular responses that are critical to host 
antiviral defenses. Examination of proteins associated 
with eigenprotein #1 profile showed enrichment of pro- 
teins related to modification of reactive oxygen species. 
For example, there was enrichment of PRDX1, PTPLAD1, 
and ERP29, suggesting heightened oxidative stress re- 
sponses in the lungs of VN1203-infected mice. H5N1 
infection causes acute lung injury and mice inoculated 
with inactivated H5N1 virus have compromised lung 
function, including altered lung elastance and increased 
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Table 2 Functional enrichment of the eigenproteins identified in the host response kinetics to VN1203 



Eigenprotein 
(No. of molecules) 


Bio functions category 

(functions annotation p-value range) 


Canonical pathway (p-value) 


Upstream regulator 
(p-value of overlap) 


eigenprotein #7 (59) 


hree radical scavenging (o.z4b-u/-z.bob-(Jz) 


NRF2-mediated oxidative stress response (240E-05) 


Ud\ (z.//b-UoJ 




Drug meiaDoiisrn p.jut-uj— z.o/t-uzj 


Glutathione-mediated detoxification (2.69E-04) 


KA A DT /V; ~71 P r\6\ 




AAUUILUiy Ulbcabc ^O. I zt-uj— O. I Zt-UJj 


G Beta gamma signaling (2.82E-04) 


TP^3 (8. Al P C\&\ 
\ r JJ [O.^f \ C-\JO) 




DeveiOpiTieilldl GlbOlGtM { 1 .jZt-U^f— Z.JDt-UZJ 


Ceramide signaling (245E-03) 


INrtZLZ (J/fUL-UjJ 




nereanary oisoraer U-3^£~U4— o/ot-uoj 


PIP~> cinnolin (1 HIP f\T\ 

tirz signann p.uzt-uoj 


DQPM 1 /I 1 AP n/l^ 
r otlM I { I . I 0P-U4J 


eigenprotein #2 (86) 


Neurological disease (3.89E-08-8.48E-03) 


Acute phase response signaling (1.26E-25) 


Nitrofurantoin* (3.03E-12) 




Cell-to-cell signaling and interaction 
(8.90E-08-1.08E-02) 


LXR/RXR Activation (1.58E-13) 


Captopril* (1.79E-08) 




Tissue development (8.90E-08-1 .08E-02) 


Clathrin-mediated endocytosis signaling (9.77E-08) 


IL6 (1.83E-07) 




Metabolic disease (2.66E-07-1.02E-02) 


Complement system (3.02E-07) 


Gentamicin* (5.04E-07) 




Lipid metabolism (5.24E-07-1 .1 7E-02) 


Coagulation system (3.55E-07) 


f 3 (8.12E-07) 


eigenprotein #3 (17) 


Cell death and survival (1 .82E-05-4.83E-02) 


Breast cancer regulation by Stathminl (1.70E-05) 


PLG (2.14E-04) 




Hematological system Development and 
function (9.90E-05-448E-02) 


Cardiac |3-adrenergic signaling (1.91E-04) 


APP (1.19E-03) 




Organismal injury and abnormalities 
(1.63E-04-4.83E-02) 


AMPK signaling (2.24E-04) 


PSEN1 (2.06E-03) 




Tissue morphology (8.38E-04-4.19E-02) 


CREB signaling in neurons (4.07E-04) 


MAP2 (2.12E-03) 




Cancer (8.38E-04-4.99E-02) 


Role of NFAT in cardiac hypertrophy (4.57E-07) 


GNG7 (2.12E-03) 



For each eigenprotein identified in the kinetics of the host response to H5N1 VN1203 wild-type, the number of correlating transcripts and proteins are indicated. 
For each set of proteins the top 5 associated over-represented biological functions, canonical pathways, and upstream regulators are indicated. Ingenuity pathway 
analysis was used to determine the top 5 bio function categories, canonical pathways, and upstream regulators. The functions annotation p-value range 
represents the range of p-values for the functions annotations associated with each bio function category. The p-value of overlap associated for each upstream 
regulator indicates the significance of the overlap between the genes targeted by the upstream regulator in the IPKB database and the experimental dataset. 
*Upstream regulators with an apteryx signify a chemical reagent or chemical drug. 



ROS and oxidized phospholipid production in the lung 
[27]. Represented within eigenprotein #3 profile, were 
APOH and PLG proteins involved in opsonization of 
cells, as well as the NADH dehydrogenase components 
NDUFA11 and NDUFA13. 

In conclusion, the three sets of co-expressed transcripts 
identified in response to VN1203-WT were associated 
with immune and apoptosis signaling pathways that in- 
creased over time, metabolic cellular processes that were 
largely decreasing, and T cell signaling pathways that 
exhibited a biphasic pattern. The three sets of proteins 
capturing the host response dynamics were related to 
host antiviral defenses, reactive oxygen species observed 
to increase as infection progressed, and opsonization. 
Notably, upstream regulators highly enriched in each of 
the transcript or protein sets were identified (p-value 
as low as 10~ m for LPS regulating eigentranscript #2), 
which confirmed that this method captures groups of 
co-regulated transcripts/proteins. Importantly, there 
was an overlap of 29 signaling pathways that were 
found in both transcript and protein sets, including 
Role of NFAT in Regulation of the Immune Response, 
Apoptosis Signaling and Coagulation System, suggesting 
complementary sensitivity between the transcriptomic and 
proteomic profiles. 



Different doses of VN1203 WT impact the magnitude 
and velocity of the pulmonary response to infection 

To determine the effects of increasing dosage on the 
kinetics of the host response to VN1203-WT, we 
used Multidimensional Scaling (MDS) [28,29] as a 
visualization method and introduced two new criteria 
quantifying magnitude and velocity changes in the signa- 
ture (Figure 3C and D). Each dot in the MDS representa- 
tions is the transcriptomic or proteomic profile of a 
biological sample, and pairwise distances between dots 
are proportional to the transcriptomic or proteomic dis- 
tances (Euclidean distances) between the samples. The 
MDS representation, calculated using the sets of tran- 
scripts and proteins associated with the VN1203-WT 
kinetics, highlighted interesting differences between the 
dosage conditions. While all samples from different dos- 
ages were close to the mocks at 1 dpi, the late time- 
points followed a similar curved trajectory. Different 
magnitudes, defined as the distance of one biological 
condition to the mock-infected condition, and velocities, 
defined as the speed of the host response moving from 
one time point to the next time point, were observed for 
the three dosage conditions for both the transcriptomic 
and proteomic levels. For example, infection with the 
highest dose of VN1203-WT virus (10 4 PFU) induced a 
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similar change in host response at 4 dpi as VN1203-WT 
infection with the lowest dose (10 2 PFU) observed 3 days 
later at 7 dpi (Figure 3C). Compared to the transcriptomic 
profiles, the MDS representation of the proteomic profiles 
showed more noise in the data (Figure 3D). For instance, 
we observed that the variations within the groups were as 
high as the variations between groups. The proteomic sam- 
ples collected at 1 and 2 days post-infection samples were 
closely clustered with the mocks. Although we were unable 
to differentiate these samples, we observed a similar effect 
of dose on the trajectories at 4 and 7 dpi, with the highest 
dose reaching the end of the trajectory more quickly com- 
pared to the lower doses. As such, the biological interpreta- 
tions derived from the proteomic analysis have to be taken 
into careful considerations. 

In order to quantify these magnitude and velocity ef- 
fects in the kinetics of the host response, we defined two 
criteria, the Magnitude Coefficient (MC) and the Velocity 
Coefficient (VC). The MC quantifies the magnitude ef- 
fect as the transcriptomic distance from one biological 
condition to the matched mock-infected condition. The 
VC quantifies the velocity effect as the speed of the 
transcriptomic host response moving from one time 
point to the next in the succession of infection. Both the 



MC and VC were calculated based on the centroids 
(arithmetic means) of the biological conditions and the 
transcriptomic distances (Euclidean distances) based on 
the signatures of transcripts associated with the kinetics 
of the host response to VN1203-WT (Additional file 5: 
Table S3, Figure 4). MC and VC were calculated only for 
the transcriptomic data, as we were unable to discrimin- 
ate proteomic samples at early time-points (1 and 2 dpi), 
resulting in a lack of sensitivity for the analysis. We first 
observed that increasing the inoculation dose was related 
to an increase in magnitude of the host response. Indeed, 
at each dpi, infection with 10 2 PFU triggered a lower MC 
compared to 10 3 PFU, while 10 4 PFU triggered the stron- 
gest MC. Moreover, different inoculation dosages also 
resulted in different velocities of the host response changes. 
With regards to time, the greatest transcriptomic VC dif- 
ference was found at 2 dpi, followed by the difference 
quantified between 1 and 2 dpi. With regards to dosage, 
the lowest transcriptomic VC coefficient difference was 
found for 10 2 PFU and the greatest transcriptomic VC co- 
efficient difference was found for 10 4 PFU. 

These results show that a similar kinetic signature of 
5,660 transcripts and 162 proteins is dysregulated after 
infection with VN1203-WT at different doses, and that 




1dpi 2dpi 4dpi 7dpi 1dpi 2dpi 4dpi 7dpi 1dpi 2dpi 4dpi 7dpi 1dpi 2dpi 4dpi 7dpi 1dpi 2dpi 4dpi 7dpi 1dpi 2dpi 4dpi 7dpi 

Time post-infection 




— i 1 1 1 — — i 1 1 1 — — i 1 1 1 — — i 1 1 1 — — i 1 1 1 — — i 1 1 1 — 
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Time post-infection 



Figure 4 Profiles of Magnitude Coefficients and Velocity Coefficients for the different viruses and dosage conditions. (A) Variation of 

Magnitude Coefficients over time for the different viruses. (B) Variation of Velocity Coefficients over time for the different viruses. Profiles are 

colored by viral strain and line types represent the different doses. The Magnitude Coefficient (MC) quantifies the magnitude effect as the 

transcriptomic or proteomic distance from one biological condition to the matched mock-infected condition. The Velocity Coefficient (VC) 

quantifies the velocity effect as the speed of the transcriptomic host response to move from one time point to the next one. Both the MC and 

VC were calculated based on the centroids of the biological conditions and the transcriptomic or proteomic distances are calculated based on 

the lists of transcripts associated with the kinetics of the host response to VN1203 wild-type. 
\ J 
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increasing dosages lead to higher magnitude changes of 
these transcripts/proteins and an increased velocity, es- 
pecially between 1 and 2 dpi. 

Mutations in VN1203 impact the magnitude and velocity 
of the host response 

Having characterized VN1203-WT dynamics, we were 
now able to determine how each specific VN1203 mu- 
tant altered the kinetics of the host responses. In par- 
ticular, our aim was to better quantify the magnitude 
differences between WT and mutants that were intro- 
duced with Figure 2. We next developed a new geomet- 
rical representation method in order to compare the 
kinetics of the host response to the various VN1203 
mutants relative to the host response to VN1203-WT 
at the transcriptomic level. Using the MDS representa- 
tion presented in Figure 3C as a reference, we projected 
the transcriptomic profiles of samples infected by the 
four VN1203 mutant viruses individually at 10 4 PFU 
(Figure 5A-D) and, in the case of VN1203-NSltrunc 
and VN1203-PBlF2del, at 10 3 and 10 4 PFU dosages 
(Figure 5C and D). While traditional MDS methods 
project a set of high dimensional objects into a lower 
dimensional space for visualization purposes, the MDS 
method that we developed allows for projection of add- 
itional objects over a predefined MDS representation 
(see Methods section). Thus, the resulting representa- 
tion allows us to visualize the similarities and differ- 
ences between the WT and mutant samples, and MC 
and VC allow us to quantify the kinetic changes in the 
host response related to magnitude and velocity, as pre- 
viously described for VN1203-WT. The reference repre- 
sentation is named a MDS Reference Map and the 
resulting projections are named MDS Projections. Quan- 
tification of magnitude and velocity revealed different 
information about each mutant. 

For VN1203-PB2627E, both the MDS projection 
(Figure 5A) and the MC and VC profiles (Figure 4) for the 
10 4 dosage were similar to that observed for VN1203-WT 
at the 10 2 dosage. Therefore, the main effect of the PB2- 
627E mutation on the host-response was to decrease it to a 
level similar to that induced by 100 times less VN1203- 
WT. This is consistent with the reduced replication we 
observed for VN1203-PB2627E in the current study 
(Figure IB) and with a previous study showing that the 
main effect of the PB2-627E mutation in the VN1203 back- 
ground was to reduce virus replication in the lung [5]. 

The main effect of VN1203-HAavir was also found to 
be an attenuated host response, with the VN1203- 
HAavir 10 4 dosage exhibiting an MDS projection and 
VC/MC profiles similar to that observed for VN1203 
WT at the 10 3 dosage (Figures 4, and 5B). The only dif- 
ference was that VC and MC at 4 dpi were slightly lower 
than the VN1203-WT, while no difference in weight loss 



or viral replication was observed between the two vi- 
ruses at that dpi. 

Strikingly, whereas VN1203-PBlF2del VC and MC 
profiles at 10 3 PFU were intermediate between VN1203- 
WT 10 2 and 10 3 , at the highest dose (10 4 PFU), both the 
peak of VC at 2 dpi and the peak of MC at 7 dpi were 
higher than VN1203-WT at the same dose (Figures 4 
and 5C). It should be noted that both body weight loss 
and virus replication were similar at this dose between 
the VN1203-WT and this mutant (Figure 1). Moreover, 
during the course of infection, four out of the five mice 
died between 4 and 7 dpi, indicating a high virulence of 
this mutant at 10 4 PFU. 

Finally, the MC and VC profiles associated with VN1203- 
NSltrunc mutation largely changed at early dpi (1 and 
2 dpi). At 1 dpi, these 2 coefficients were higher in the 
mutant than in the matched dose of the WT, but lower 
at 2 dpi (Figures 4, and 5D). This was related to the en- 
hanced, early expression (i.e. at 1 dpi, relative to 2 dpi for 
all other viruses) of transcripts directly involved in the 
innate immune response and inflammation (Additional 
file 6: Figure S3), which was also reported previously 
[31]. It was interesting to find that this behavior was very 
similar to the CA04 at the highest dose (10 6 PFU). 
CA04-WT was used as a much lower pathogenicity com- 
parator which confirmed that the main effect of increasing 
infection dose was to increase the magnitude of the host- 
response (Additional file 7: Figure S4, and Figure 4). 

To summarize, each mutant of VN1203 dysregulated 
the kinetic signature of 5,660 transcripts with a specific 
dynamic in terms of both magnitude of changes (MC), 
and speed at which they occurred (VC). While VN1203- 
PB2E627 and HAavir induced changes similar to those 
induced by 100 or 10 times lower doses of VN1203-WT 
respectively, VN1203-PBlF2del had a bimodal response 
with attenuation at 10 3 PFU, but an increase in both 
magnitude and velocity at 10 4 PFU. Finally, VN1203- 
NSltrunc induced changes in the 5,660 transcripts with 
a kinetics and magnitude more similar to CA04-WT 10 6 
PFU than VN1203-WT. 

Magnitude and velocity of the host responses are 
associated with weight loss and viral titer 

To determine whether the VC and MC were related to 
disease outcome and viral pathogenicity, we performed a 
correlation analysis between these coefficients and the 
phenotypic variables. All biological conditions were consid- 
ered together in these analyses to assess the correlations 
with a large significance. For each biological condition, we 
determined the mean body weight of the animals, viral 
titer, viral messenger RNA and viral genomic RNA mea- 
surements in the lung, and calculated the difference be- 
tween the means of any two consecutive time-points for 
viral titer (A mean viral titer), viral RNA (A mean viral 
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Figure 5 MDS Projections of the VN1203 mutants infected transcriptomic profiles over the VN1203-WT Reference Map. 

(A) Multidimensional Scaling Projection (MDS Projection) of the transcriptomic profiles of the VN1203-PB2627E dosage condition over the VN1203 
wild-type Multidimensional Scaling Reference Map (MDS Ref Map). (B) MDS Projection of the transcriptomic profiles of the VN1203-HAavir dosage 
condition over the VN 1203 wild-type MDS Ref Map. (C) MDS Projection of the transcriptomic profiles of the VN1 203-PB1 F2del dosage conditions 
over the VN1203 wild-type MDS Ref Map. (D) MDS Projection of the transcriptomic profiles of the VN1203-NS1trunc dosage conditions over the 
VN1203 wild-type MDS Ref Map. Each dot is the transcriptomic profile of a biological sample plotted in the intensity space of gene expression. 
Pairwise distances between the dots are proportional to the transcriptomic distances between the samples. MDS Projections allow to project 
additional 'omics profiles over a predefined MDS representation (MDS Ref Map). Transcriptomic distances have been calculated based on the 
signature of 5,660 transcripts that significantly correlate with one eigentranscript. Dots are colored in order to indicate the dosage conditions, and 
biological conditions are indicated by the convex hull (i.e. the smallest convex set containing the points [30]) of the set of biological replicates and 
labeled to indicate the time point post-infection. Samples and biological conditions of the H5N1 VN1203 wild-type 10 4 PFU infection dosage are 
indicated by gray dots and gray convex hulls. Hence the grey spots that are connected represent the -omics profiles of mice lung infected by the 
VN 1 203 wild-type virus at 1 0 4 PFU, while the ones not connected represent the -omics profiles for the other infection concentrations. The Kruskal 
Stress shown in each representation quantifies the quality of the geometrical representation as a fraction of the information lost during the 
dimensionality reduction procedure. 
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mRNA), and viral genomic RNA (A mean viral gRNA). For 
each transcriptomic biological condition the associated 
MC, VC, mean body weight, A mean viral titer, A mean 
viral mRNA, and A mean viral gRNA values are shown in 
Additional file 5: Table S3. A significant association was 
found between MC and mean body weight (Spearman's 
rho = -0.7984), suggesting that the magnitude of host 
response is significantly associated with morbidity 
(Figure 6A). The VC was not significantly associated with 
any criteria when considering phenotypic variables within 
a given time-point; however, the VC was significantly asso- 
ciated with A mean viral titer (Spearman's rho= 0.8459, 
Figure 6B), A mean viral mRNA (rho =0.8544), and A 
mean viral gRNA (rho =0.8426) at the previous time-point. 
This suggests that the velocity of the host response can be 
predicted by changes in viral replication at the preceding 
time-point. 

In conclusion, the magnitude of changes in the host 
response of the 5,660 transcripts was correlated with 
morbidity at similar time-points, while the velocity of 
these changes between two time-points was associated 
with extent of changes in viral replication at previous 
time-points. 

Discussion 

Highly pathogenic avian influenza H5N1 virus is known 
to induce aberrant host responses leading to severe im- 
munopathology in the lung. In particular, the potent 
pro-inflammatory response, commonly referred to as 
cytokine storm', is the suspected cause of acute lung in- 
jury [32]. Genomic evidence from animal model systems 
shows HPAI H5N1 viruses strongly enhance cytokine and 



chemokine transcriptional responses compared to seasonal 
influenza viruses (reviewed in [33]) and 2009 pandemic 
H1N1 influenza viruses [34]. Here, we utilized a mouse 
model to study the system dynamics of H5N1 infection, 
comparing and contrasting the effects of increasing dosage 
and different previously described and newly generated 
viral mutants with varying degrees of pathogenicity toward 
the kinetics of pulmonary responses. Using a systems ap- 
proach, we present a temporal analysis of transcriptomic 
and proteomic profiles measured from mouse lung sam- 
ples collected during the acute phase of infection in order 
to better understand both dose-dependent and temporal 
mechanisms of host responses to influenza infection. 

We report that the wild-type and mutant viruses differ- 
entially regulate a common set of transcripts and pro- 
teins, demonstrating a large degree of overlap in the host 
response among the different strain and dosage condi- 
tions. Regardless of the virus genetic change, we show 
that the host response is mainly impacted by the magni- 
tude of gene expression and the speed at which these 
changes occur for this core set of transcripts, as opposed 
to transcriptomic differences explained by each specific 
mutation. We identified three main eigentranscripts as- 
sociated with host immune mechanisms, including meta- 
bolic cellular processes and T cell signaling pathways, 
and three main eigenproteins associated with host anti- 
viral defenses. Through a new geometric representation 
method and two criteria (the magnitude and velocity co- 
efficients), we were able to visualize and quantify the 
magnitude and velocity kinetic effects of the host re- 
sponse to wild-type and mutant H5N1 VN1203 viruses, 
which were influenced by both the infection dosage and 
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Figure 6 Spearman's coefficients between MC and body weight and between VC and titer changes. (A) Scatter plot between the 


Magnitude Coefficients and the mice body weight loss measurements. (B) Scatter plot between the Velocity Coefficients and the changes in viral 


titer measurements at the previous time-point. Each dot represents a couple of values for a single biological condition and is colored in order to 
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well as the associated p-value. 
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specific modifications to key H5N1 virulence determinants. 
Intriguingly, two newly generated H5N1 VN1203 mutant 
viruses, VN1203-PBlF2del and VN1203-NSltrunc, had 
the most distinct profiles, with apparent displacement rela- 
tive to all other viruses at 2 dpi and 1 dpi, respectively. The 
divergent kinetic effects occurred mostly at early time- 
points (between days 1 and 2 post-infection), and we 
observed analogous kinetic effects with different doses of 
low pathogenicity 2009 pandemic H1N1 influenza virus 
(CA04-WT). Importantly, we demonstrate that the magni- 
tude and velocity kinetic effects were associated with clin- 
ical disease severity and virus replication. Specifically, 
mouse weight loss correlated with the magnitude of the 
host response, and infectious viral particle production at a 
given time point correlated with the velocity of the host re- 
sponse at the next time-point. 

Compared to the differentially expressed transcripts, 
changes in protein abundance were less apparent, due in 
part to inherent noise in the data that makes computa- 
tional analyses of proteomic measurements particularly 
challenging [35,36]. However, we have been able to iden- 
tify the same overall kinetic pattern, as observed with 
the transcriptomic data. Due to this variability, we have 
not used the proteomic data to quantify the magnitude 
and velocity effects and the biological interpretations we 
have drawn from the proteomic analysis have been done 
with careful consideration. 

Previous work by Chang and coworkers, using a com- 
pendium of about 200 publicly available transcriptomic 
profiles of mouse lungs, showed differences in pathogen- 
icity among respiratory viruses were explained in part by 
the changes in magnitude of gene expression [37]. How- 
ever, due to constraints of the meta-analysis, time was 
not taken into consideration. Moreover, the different 
microarray platforms and mouse genetic backgrounds 
used in the different datasets introduced noise into the 
analysis. Here, we have expanded upon the findings by 
Chang et al. through a kinetic analysis that incorporates 
the dimension of time, uses a single mouse genetic back- 
ground and isogenic respiratory viruses that differ based 
on mutations to known pathogenicity determinants. 
Transcriptomic and proteomic samples were collected 
in a systematic manner to generate a comprehensive 
dataset that provides a powerful resource for modeling 
pathogen-host interactions. The large sample number al- 
lows to infer co-expression and co-regulation networks 
for identification of unknown associations and dynamic 
interactions between biological components. Moreover, 
the extensive collection of sampled time-points allows to 
model causality of the biological system for discovery of 
novel biological events. In addition, a large panel of ma- 
chine learning and data mining algorithms can be used, 
trained and tested based on this assembled dataset. While 
our study focuses on acute responses to H5N1 virus, 



other transcriptomic datasets, such as the one reported 
by Pommerenke and coworkers, encompass acute and 
adaptive host responses to influenza virus [38] . It will be 
important to consider changes during adaptive host re- 
sponses to fully appreciate the impact of magnitude and 
velocity kinetic effects on the outcome of viral infection. 

Several genomic studies have examined host responses 
to H5N1 virus in mice for a deeper understanding of 
molecular events impacting viral pathogenesis and dis- 
semination. Cilloniz and coworkers demonstrated robust 
transcriptional changes of inflammatory response genes, 
including upregulation of inflammasome genes (e.g., 
CASP1, IL1B, and NLRP3), in response to H5N1 VN1203 
compared to 1918 pandemic influenza virus at day 1 post- 
infection [6]. Investigation of the host response to viruses 
possessing specific mutations has also been investigated. 
For example, Fornek et al. showed that the H5N1 influenza 
A/Hong Kong/486/97 virus containing amino acid substi- 
tution E627K in PB2 upregulated TCR complex genes (e.g., 
CD3D and CD3G) in the lungs of infected mice at day 2 
post-infection [11]. It must be noted that the PB2 mutation 
is not the only difference between these two viruses. Our 
kinetic analysis further supports the likelihood of impaired 
T cell responses to H5N1 infection. In a separate study, 
a recombinant virus expressing H5N1 PB1-F2 N66S 
compared to the wild-type recombinant virus showed dif- 
ferences in the timing of interferon-stimulated gene ex- 
pression in mouse lung [20]. Proteomic analysis of primary 
human monocyte-derived macrophages infected with 
H5N1 virus have also been investigated and changes in 
protein abundance of components associated with pro- 
tein synthesis machinery have been observed as early as 
1 hour post-infection [39]. 

Importantly, these kinetic properties of the host response 
may not be fully captured by traditional statistical methods 
that largely depend on different assumptions and applied 
criteria. For instance, methods used for the identification 
of differentially expressed genes and over-represented 
pathways can be negatively impacted by 1) gene magni- 
tude changes, via the fold-change cutoff, 2) the dynamics 
of the host responses, as biological condition compari- 
sons are usually performed within a given time-point, 
and 3) the variability of the biological conditions, via the 
p-value cutoff. For example, the comparison of the 
VN1203 wild-type condition at 10 4 PFU and at 2 dpi with 
the VN1203-PBlF2del condition at 10 3 PFU and at 2 dpi 
will detect a large number of differentially expressed 
transcripts or proteins (Figure 5C). On the other hand, a 
comparison of the VN1203 wide-type condition at 10 4 
PFU and at 2 dpi with the VN1203-PBlF2del condition 
at 10 3 PFU and at 2 dpi will detect a smaller number of 
differentially expressed transcripts and proteins. How- 
ever, in both cases, the virus triggered the same host re- 
sponse mechanisms, but with different kinetics. 
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Compare to more traditional analysis methods, the 
Singular Value Decomposition analysis that we used for 
the kinetics analysis has several advantages that we 
exploited here. This analysis is based on an unsupervised 
method and does not require pulling the samples into 
distinct groups (i.e. the samples are not grouped by time 
points post-infection or by inoculation concentrations). 
This unsupervised aspect was crucial in our analysis be- 
cause of the shifts on the host response that we described 
in this study. Using this SVD analysis strategy, we have 
been able to successfully capture the kinetics pattern of 
the host response system to VN1203 wild-type without 
having to group the samples into particular bins. More- 
over, the kinetic patterns of the sub-biological mecha- 
nisms identified (eigentranscripts and eigenproteins) can 
be naturally displayed with this analysis strategy. These 
representations of eigentranscripts and eigenproteins 
show in a very concise way the kinetics profiles of all the 
associated transcripts and proteins. For example, we have 
been able to clearly observe the transient profile of tran- 
scripts associated with T-cell responses. With more trad- 
itional methods (i.e. gene modules, gene heatmaps or 
PCA representations in gene space), this kind of repre- 
sentation is not systematic and could produce large unin- 
telligible representations. 

The MDS Reference Maps and Projections representa- 
tions that we developed and used in the context of this 
study were particularly useful to drive our analysis. By 
using the VN1203 wild- type virus -omics profiles as a 
reference representation, we have been able to observe 
the behavior of the host response to the other variants. 
Traditional visualization representation methods do not 
naturally allow fixed positions of a subset of high- 
dimensional objects. In the context of the development of 
novel visualization methods [40-42], we showed the rele- 
vance of this approach for the analysis of large datasets. 

Conclusions 

The emergent properties of a biological system are dy- 
namic and best comprehended by examining the system 
through the sum of its individual components [43]. 
Through a systems biology approach, we have demon- 
strated the importance of the kinetics of the host re- 
sponse to H5N1 pathogenesis. Magnitude and velocity 
represent emergent properties of the system that are 
best captured using an expanded time-course, multiple 
doses and more than one mutant virus. Therefore, 
changes in gene expression induced by wild-type and 
mutant virus are not necessarily specific to a particular 
mutation, but rather encompass kinetic changes where 
increasing viral dosage is found to proportionally in- 
crease the Magnitude Coefficient and impact the timing 
of peak Velocity Coefficient during infection. Know- 
ledge of the timing of host response changes will have 



important implications for treatment administration, 
particularly early in infection. 

Methods 

Viruses 

All recombinant influenza viruses - including H1N1 CA04 
wild-type, the H5N1 VN1203 wild-type and the panel of 
four H5N1 VN1203 mutant viruses encoding changes in 
specific pathogenicity markers - were generated by using 
reverse genetics coupled with site-directed mutagenesis 
(for mutant viruses), essentially as described previously 
[18,44,45] .The H5N1 VN1203 mutant viruses included the 
following: a virus in which the hemagglutinin (HA) surface 
protein poly-basic cleavage site (RERRRKKR|G) was mu- 
tated to (RETR|G) (referred to as H5N1 VN1203-HAavir 
mutant), which has been previously described [2]; a 
virus possessing a K— >E amino acid substitution at 
residue 627 of the PB2 protein (referred to as H5N1 
VN1203-PB2627E), which was previously described [4]; a 
newly generated virus in which a stop codon mutation was 
introduced into the NS1 protein open reading frame at 
amino acid 124, which was achieved by a T— >A nucleotide 
substitution at position 400 of the H5N1 VN1203 NS gene 
segment (referred to as H5N1 VN1203-NSltrunc); and a 
newly generated virus in which the PB1-F2 open reading 
frame was eliminated by introduction of three stop codon 
mutations (affected nucleotide positions in the H5N1 
VN1203 PB1 gene segment: T120C, C153G, G291A; re- 
ferred to as H5N1 VN1203-PBlF2del). Mutations in the 
PB1-F2 and NS1 open reading frames did not affect the 
coding sequences of collinear PB1 and NS2 proteins, re- 
spectively. Primer sequences are available upon request. 

Infections and sample collection 

All experiments using live H5N1 and H1N1 viruses were 
performed in an animal enhanced biosafety level 3 (ABSL3 
and ABSL3+) containment laboratories at the University 
of Wisconsin-Madison and University of North Carolina, 
approved for use by the United States (US) Centers for 
Disease Control (CDC) and the US Department of Agri- 
culture. All animal experiments and procedures were ap- 
proved by the University of Wisconsin-Madison School 
of Veterinary Medicine Animal Care and Use Committee 
and by the University of North Carolina Institutional 
Animal Care and Use Committee. Twenty-week-old fe- 
male C57BL/6J mice (Jackson Laboratories) were anes- 
thetized by isoflurane inhalation or i.p. ketamine/xylazine 
injection and mock-infected or infected with 10 2 , 10 3 , 
10 4 , 10 5 or 10 6 Plaque Forming Units (PFU) of virus, as 
indicated in the text, figures and figure legends. Viruses 
were diluted in Minimum Essential Medium (MEM) 
containing 0.3% BSA, and the same medium without virus 
was used for mock infections. Three to five mice from both 
mock-infected and infected groups were euthanized at days 
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1, 2, 4 and 7 post-infection, and lungs were harvested 
and divided into multiple lobes for virus titration, 
RNA extraction or protein extraction. For each wild- 
type or mutant virus, infections were performed in in- 
dependent experiments, and the same lung lobe was 
collected for each analysis method from every mouse. 
Importantly, we did not detect any differences in the 
amount of virus recovered from any lung lobe, indi- 
cating there was no bias in the distribution of viral infec- 
tion in the mouse lungs (Additional file 8: Figure S5). 
Individual or group mouse body weights were collected 
on a daily basis to monitor the disease course, and 
mice were humanely euthanized upon reaching the ex- 
perimental endpoint (i.e., sample collection or severe 
clinical symptoms). 

Median Lethal Dose (MLD) analysis 

To determine Median Lethal Dose values (MLD) for 
VN1203-WT and mutants, we infected 6-week-old 
BALB/C mice with serial dilutions of each virus (3 mice 
per dilution) and monitored individual mouse body weight 
loss and survival over 14 days. All mice were humanely eu- 
thanized either when exhibiting severe clinical symptoms, 
or at the end of the 14 day observation period. MLD values 
were calculated according to the method of Reed and 
Muench [46]. 

Virus titration 

Lung tissues were homogenized using a TissueLyser 
(Qiagen) or MagnaLyser (Roche) in 1 ml of MEM 
containing a penicillin/streptomycin mixture (Life Tech- 
nologies), and following centrifugation to remove debris, 
virus titrations were performed in Madin-Darby Canine 
Kidney (MDCK) cells using standard methods. Virus 
titers are represented as the total number of infectious 
viruses isolated per lobe. 

Viral mRNA and genomic RNA measurements 

Quantification of influenza vRNA and mRNA transcripts 
was assessed using a strand-specific real-time RT-PCR 
method previously described [47]. Briefly, H5N1 cDNAs 
complementary to the two types of influenza viral RNA 
from the NP genomic segment were synthesized with 
tagged primers to add an 18-20 nucleotide tag that was 
unrelated to the influenza virus at the 5' end (vRNAtag; 
GGCCGTCATGGTGGCGAAT and mRNAtag; CCAG 
ATCGTTCG AG TCG T) . To the mixture, 4 ul of the 
RNA sample was added with 1.5 ul of 10 uM primer 
specific for the NP genomic segment vRNA (5'-GG 
CCGTCATGGTGGCGAATAGCAAAAGCAGGGTAG 
ATAATCACTC-3') or mRNA (5'-CCAGATCGTTCGAG 
TCGTTTTTTTTTTTTTTTTTCTTTAATTGTC-3') and 
made up to 13 ul with RNase free water. The mixture 
was then incubated at 65°C for 5 min and was cooled 



to 4°C. The reaction mixture [5 ul of First Strand buffer 
(5x, Invitrogen, Carlsbad, CA), 4 ul of 25 mM MgCl2, 2 
ul of 0.1 M dithiothreitol and 1 ul of Superscript II RT 
(50 U/ul, Invitrogen)] was then added. The RT reaction 
was carried out at 42°C for 60 min and was terminated 
by heating at 70°C for 5 min. After the RT reaction, 
SYBR-Green I real time PCR was then carried out using 
diluted cDNA as a template, in which 1 ul of cDNA 
was added to the qPCR reaction mixture [10 ul SYBR 
GreenERqPCRSuperMix for ABI PRISM (2x), 1.5 ul of 
each primer for the H5N1 NP genomic vRNA (10 uM; 
5'-GGCCGTCATGGTGGCGAAT-3' and 5-GTTCCCCA 
CCAGTTTCCATC-3') or mRNA (5-CCAGATCGTTCG 
AGTCGT-3' and 5-CGAACCCGATCGTGCCTTCC-3'), 
3 ul of double-distilled water]. The cycle conditions of 
qPCR were 95°C 10 min, followed by 40 cycles of 95°C 
15 s and 60°C for 1 min. H1N1 cDNA was created simi- 
larly as above with the NP genomic segment vRNAoligo 
listed above or mRNA-specific oligo 5'-CCAGATCGTT 
CGAGTCGTTTTTTTTTTTTTTTTTCAACTGTCATA 
CTC-3'. The genomic vRNA PCR reactions were as- 
sembled identically as above for the H5N1 virus, and 
mRNA reaction was assembled with the mRNA oligo 
tag primer ^-CCAGATCGTTCGAGTCG-S' and H1N1- 
specific NP mRNA primer 5'-GCTCCCCACCAGTCTCC 
ATT-3'. As an endogenous control, RPL10 mRNA level 
was measured using primers 5-TGAAGACATGGTTG 
CTGAGAAG-3' and 5'-GAACGATTTGGTAGGGTAT 
AGGAG-3'. 

RNA isolation and microarray processing 

Lung tissues were directly submerged in RNA-Later 
stabilization solution (Life Technologies) following dis- 
section, placed at 4°C overnight, followed by freezing 
at -80°C. Lung tissues were thawed, transferred to 1 ml 
TRIzol (Life Technologies) and homogenized using a 
TissueLyser or MagnaLyser. RNA was isolated using 
QiagenRNeasy Mini columns and the manufacturers 
recommended protocol (Qiagen Inc., Valencia, CA). Fluo- 
rescently labeled probes were generated from each RNA 
sample using Agilent one-color Low Input Quick Amp 
Labeling Kit (Agilent Technologies). Individual cRNA 
samples were hybridized to oligonucleotide microarrays 
for gene expression profiling using Whole Mouse Genome 
Microarray Kit (Agilent Technologies). All the microarray 
experiments passed the quality control criteria of the 
Agilent Feature Extraction Software. 

Transcriptomics data extraction and normalization 

Transcriptomics data have been extracted from the 
AgilentWhole Mouse Genome Microarray raw data 
and scale normalized [48-50] using the 'normalize 
BetweenArrays' normalization method available in limma 
package [51] of the R suite [52]. 
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Proteomics sample preparation 

All chemicals and reagents were purchased from Sigma- 
Aldrich (St Louis, MO) unless stated otherwise. Am- 
monium bicarbonate and acetonitrile were purchased 
from Fisher Scientific (Pittsburgh, PA), and sequencing- 
grade modified trypsin was purchased from Promega 
(Madison, WI). Bicinchoninic acid (BCA) assay reagents 
and standards were obtained from Pierce (Rockford, IL); 
and purified, deionized water, >18 MH, (Nanopure Infinity 
ultrapure water system, Barnstead, Dubuque, IA) was used 
to make all aqueous buffers. Dissected lung tissues were 
directly frozen at -80°C To prepare protein extracts, 
thawed tissues were suspended in 1 ml of 8 M urea in 
50 mM NH4HCO3 and homogenized using a TissueLyser 
or Magnalyser. Protein concentrations of the cleared ho- 
mogenates were determined by BCA protein assay and 
diluted to uniform volume in 50 mM ammonium bicar- 
bonate, pH 7.8. Proteins were reduced with 10 mM 
dithiothreitol, followed by alkylation of free sulfhydryl 
groups with 40 mM iodoacetamide at 37°C in the dark; 
each reaction was performed for 1 h at 37°C with constant 
shaking at 550 rpm. Denatured and reduced samples were 
diluted 10-fold with 50 mM ammonium bicarbonate, pH 
7.8, and CaCl 2 was added to a final concentration of 1 mM 
prior to enzymatic digestion. Sequencing-grade modified 
trypsin was activated by adding 20 \xL of 50 mM ammo- 
nium bicarbonate, pH 7.8, to 20 \ig lyophilized trypsin and 
incubating for 10 min at 37°C. Activated trypsin was then 
added to the samples at 1:50 (w/w) trypsin-to-protein ratio, 
and samples were digested at 37°C for 3 h with constant 
shaking at 800 rpm; reactions were quenched by rapid 
freezing in liquid nitrogen. Digested samples were desalted 
using solid phase extraction columns (Discovery CI 8, 
Supelco, Bellefonte, PA), according to the manufacturers 
instructions. Samples were concentrated to 100 \iLin vacuo 
(Speed-Vac SC 250 Express, Thermo Savant, Holbrook, 
NY), and a BCA protein assay was performed to ver- 
ify final peptide concentrations. Samples were stored 
at -80°C until strong cation exchange fractionation 
with liquid chromatography-tandem mass spectrometry 
(LC-MS/MS) or quantitative LC-MS analyses. 

Strong cation exchange fractionation 

Strong cation exchange fractionation was performed as 
previously described [53]. Twenty four fractions were 
collected from minute 30 to minute 65 of the gradient, 
and they were subsequently dried in vacuo and stored 
at -80°C until LC-MS/MS analysis. 

Reversed-phase capillary LC-MS/MS and LC-MS analyses 

Capillary LC-MS/MS analysis was used to generate a 
peptide accurate mass and time (AMT) tag database 
[54] for virus -infected lung homogenates (see below). 
For this, dried peptide fractions were reconstituted in 



30 \iL of 25 mM ammonium bicarbonate, pH 7.8, and ana- 
lyzed using a 4-column custom-built capillary LC system 
coupled online to a linear ion trap mass spectrometer 
(LTQ; Thermo Scientific, San Jose, CA) by way of an in- 
house manufactured nanoelectrospray ionization interface, 
as previously described [55]. To identify the eluting pep- 
tides, the LTQ stage was operated in a data-dependent 
MS/MS mode as previously described [53]. 

Following lung and virus peptide/protein AMT tag 
database generation, capillary LC-MS analyses were 
performed on all individual H5N1 and mock-infected 
samples to generate quantitative data. For this, dried pep- 
tide samples were reconstituted in 30 \iL of 25 mM 
ammonium bicarbonate, pH 7.8, and analyzed in du- 
plicate and random order using identical chromato- 
graphic and electrospray conditions as for capillary 
LC-MS/MS analyses. The LC system was interfaced to an 
ExactiveOrbitrap mass spectrometer (Thermo Scientific), 
and the temperature of the heated capillary and the ESI 
voltage were 250°C and 2.2 kV, respectively. Data were col- 
lected over a range of 400-2,000 m/z. 

Development of the AMT tag database for virus-infected 
C57BL/6J mice 

An AMT tag database of identified peptides was devel- 
oped for virus-infected lungs, using mock-infected and 
virus-infected samples from two studies: 1) H5N1 data 
reported here and 2) another with SARS-CoV MAI 5. 
The details of study 2 will be described elsewhere. To 
generate the AMT tag database, aliquots of the H5N1- 
or mock-infected samples were combined to make the 
following pools: 1) mock-infection (all time points), 2) 
early infection (1 and 2 d), and 3) late infection (4 and 7 d). 
Samples from the SARS-CoV MA15 studies were simi- 
larly pooled. Each pool was subjected to strong cation 
exchange fractionation as described above, and each 
fraction was analyzed by nanocapillary LC-MS/MS. 
The SEQUEST analysis software [56] was used to match 
the MS/MS fragmentation spectra with sequences from 
the April 20, 2010 UniProt/Swiss-Prot rodent protein data- 
base, containing 16,244 entries (protein TITIN_Mouse was 
removed due to excessive length). The data were also 
matched to sequences from the H5N1 viral proteome. 
When searching, SEQUEST used a dynamic mass modifi- 
cation on methionine residues corresponding to oxidation 
(15.99 Da) and a static mass modification on cysteinyl 
residues to account for alkylation by iodoacetamide 
(57.02 Da). Peptides passing the following filter criteria 
were stored as AMT tags in a Microsoft SQL Server data- 
base: 1) SEQUEST DelCn2 value (normalized Xcorr differ- 
ence between the top scoring peptide and the second 
highest scoring peptide in each MS/MS spectrum) > 0.10 
and 2) SEQUEST correlation score (Xcorr) > 1.6, 2.4, and 
3.2 for fully tryptic peptides with 1+, 2+, and 3+ charge 
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states, respectively, and Xcorr > 4.3, and 4.7 for partially 
tryptic or non-tryptic protein terminal peptides with 2+, 
and 3+ charge states, respectively. Fully non-tryptic pep- 
tides were excluded, and a minimum peptide length of 
6 amino acid residues was required. These criteria 
resulted in 39,744 peptides identified with an estimated 
false discovery rate of <2% based on a target-decoy data- 
base search [57]. The elution times for these peptides 
were normalized to a range of 0 to 1 using a predictive 
peptide LC normalized elution time (NET) model and 
linear regression, as previously reported [58]. A NET 
average and standard deviation were assigned to each 
identified peptide if the same peptide was observed in 
multiple analyses. Both calculated monoisotopic masses 
and observed NETs of identified peptides were included 
in the AMT tag database. Identified MS/MS spectra 
corresponding to peptides in the AMT tag database 
are available at the PRoteomicsIDEntification (PRIDE) 
database (http://www.ebi.ac.uk/pride/) under the pro- 
ject name A Systems Biology Approach to Emerging 
Respiratory Viral Diseases in the PRIDE Public Projects 
folder and corresponding to PRIDE Accession numbers 
19855-19860. 

Processing of quantitative LC-MS datasets 

Quantitative LC-MS datasets were processed using the 
PRISM Data Analysis system [59], which is a series of 
software tools developed in-house (e.g. Decon2LS [60] 
and VIPER [61]; freely available at http://ncrr.pnl.gov/ 
software/). Individual steps in this data processing ap- 
proach are reviewed here [54]. The peptide identities for 
detected features in each dataset (i.e. a single LC-MS 
analysis) were determined for features matched to AMT 
tags with high confidence based upon the accurate mea- 
sured monoisotopic masses and NETs for each of the 
39,744 peptides in the filtered AMT tag database within 
initial search tolerances of ± 6 ppm and ± 0.025 NET for 
monoisotopic mass and elution time, respectively. The 
peptides identified from this matching process were 
retained as a matrix for subsequent data analysis. The raw 
quantitative proteomics data can be accessed at the PNNL 
Biological MS Data and Software Distribution Center 
(http://omics.pnl.gov/) in the Systems Virology Contract 
Data folder within the Browse Available Data folder. 

Proteomics data processing, missing values imputation, 
and normalization 

Quality control processing was performed to identify and 
remove contaminant proteins, redundant peptides, pep- 
tides with an insufficient amount of data across the set of 
samples [62], and LC-MS datasets that showed signifi- 
cant deviation from the standard behavior of all LC-MS 
analyses [63]. The peptide abundance values were normal- 
ized across the technical replicates using a global median 



centering of the data. [64]. The normalized loglO 
abundance values were averaged across the technical repli- 
cates within each biological sample. Peptide level signifi- 
cance patterns were used for protein roll-up to select 
appropriate peptides for protein quantification. Proteins 
were quantified using a standard R-Rollup method using 
the most abundant reference peptide [65], after filtering 
the peptides that were redundant, had low data content, or 
were outside the dominant significance pattern. 

Missing expression values into each proteomics profile 
have been inferred using the 'impute.knn k-nearest neigh- 
bor imputation method [66] available in the Imputation' 
package [67] of the R suite [52]. In order to avoid inconsist- 
ent missing value imputations, the nearest neighbors have 
been restricted to proteomics profiles belonging to the 
same biological condition. The number of nearest neigh- 
bors used in the imputation (the k parameter) has been set 
to 2/3 of the number of samples of the biological condi- 
tion. Out of the 1,069,500 (3,875x276) proteomic expres- 
sion values, 450,814 (42.15%) had to be inferred. Biological 
conditions at the earliest time points post-infection showed 
the highest degree of imputation, while the biological con- 
ditions at the latest time points post-infection showed the 
lowest degree of imputation. When expression values for a 
protein having been found as missing in more than 
2/3 of the number of samples of the biological condition, 
the expressions values have been set to 10E-4 in order to 
avoid infinite fold-changes values when comparing to 
other conditions. 

The whole compendium of proteomics expression pro- 
files center based on the value of the 9 th decile and scale 
normalized [48-50] using the normalizeBetween Arrays' 
normalization method available in limma package [51] of 
the R suite [52]. 

Identification of differentially expressed transcripts 
and proteins 

Statistically significant differentially expressed transcripts 
and proteins have been identified using the CDS method 
[68]. The CDS statistical test allows the identification of 
differentially but also similarly expressed transcripts and 
proteins between two biological conditions using a stat- 
istical hypothesis test that formulates fold-change state- 
ments. We used a fold-change parameter of 2 for the 
identification of differentially expressed transcripts and a 
fold-change parameter of 1.2 for the identification of dif- 
ferentially expressed proteins. Transcripts or proteins 
having a p-value generated by this statistical test lower 
than 0.01 have been asset as differentially expressed. 

Identification of over-represented biological functions, 
canonical pathways and upstream regulators 

Functional enrichment of biological functions, canonical 
pathways and upstream regulators was performed using 
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Ingenuity Pathways Analysis (Ingenuity Systems, Inc.). 
IPA examines differentially expressed transcripts and 
proteins in the context of known biological functions, 
mapping each gene identifier to its corresponding mol- 
ecule in the Ingenuity Pathways Knowledge Base (IPKB). 
For all analyses, the p-values were generated using the 
right- tailed Fisher's Exact Test [69]. 

Upstream Regulator Analysis in IPA incorporates down- 
stream target genes from the experimental dataset and 
compiled knowledge of reported relationships between reg- 
ulators and their known target genes within IPKB. This 
analytical tool was used to predict the upstream regulators 
in our analysis. 

Multidimensional scaling representations 

Multidimensional Scaling representations have been gener- 
ated using the SVD-MDS method [29]. Multidimensional 
scaling methods represent the similarities and differences 
amongst high dimensionality objects into a space having a 
low number of dimensions, usually in 2 or 3 dimensions 
for visualization purposes [28]. Pairwise distances between 
the dots are proportional to the transcriptomic or prote- 
omic distances between the samples. The Kruskal Stress 
criterion [28] shown in the representations, quantifies the 
quality of the representations as a fraction of the informa- 
tion lost during the dimensionality reduction procedure. 
The SVD-MDS method performs the dimensional reduc- 
tion procedure by using a molecular dynamics approach 
and by initialing the metric space using a singular value 
decomposition method, resulting in better representa- 
tion of the similarities and differences in term of infor- 
mation conservation. 

Identification of transcripts and proteins correlating with 
the kinetics pattern of the host response to VN1203 
wild-type 

Transcriptomic and proteomic intensity expression matri- 
ces of the wild-type VN1203 infected samples have been 
factorized using a Singular Value Decomposition method. 
Transcripts and proteins that correlated with at least one 
eigentranscript or eigenprotein (right- singular vectors) 
were then identified using the Pearsons coefficient of cor- 
relation. We used a cutoff of 0.65 on the Pearsons coeffi- 
cient of correlation and a cutoff of 0.01 on the p-value 
associated with this correlation coefficient. The p-values 
associated with Pearsons coefficients of correlation have 
been obtained using the Student's t distribution for a trans- 
formation of the correlation. Mock infected biological con- 
ditions were not used in the inference of eigentranscripts 
and eigenproteins, and biological conditions at 1 dpi and 2 
dpi were not used in the inference of the eigenproteins be- 
cause of the relatively low levels of proteins identified as 
differentially expressed at these time-points. In order to 
discard transcripts or proteins that could contribute to 



noise the analysis, only the transcripts and proteins found 
as differentially expressed in at least one biological condi- 
tion have been used in the SVD analysis. 

The SVD analysis has been performed as described 
here after. Both the transcriptomic and proteomic ex- 
pression matrices M (mxn, for m genes and n samples) 
have been factorized in the form of: 

M=USV t where U is a real unitary matrix (mxm), S is a 
rectangular diagonal matrix (mxn) with nonnegative 
real numbers on the diagonal, and V t (the conjugate 
transpose of V) is a real unitary matrix (nxn). 

These factorizations have been done using the "svd" 
function of the R suite [52]. 

The V matrices have been transposed in order to obtain 
the eigenvectors (eigentranscripts or eigenproteins). As 
52 transcriptomic samples have been used in the SVD 
factorization, then 52 eigentranscripts have been identi- 
fied. Respectively, as 22 proteomic samples have been 
used in the SVD factorization, then 22 eigenproteins have 
been identified. 

Using an iterative procedure, each vector of tran- 
script or protein expression values (of the original 
expression matrices) have been compared to the identified 
eigentranscripts or eigenproteins. The statistical compari- 
son has been done using the Pearson s coefficient of correl- 
ation as described before. 

We identified a total of 3 eigentranscripts for which at 
least 5 transcripts were correlating, and we identified 
a total of 3 eigenproteins for which at least 5 proteins 
were correlating. 

Magnitude and velocity coefficients 

The Magnitude Coefficient (MC) quantifies the magnitude 
effect as the transcriptomic distance from one biological 
condition to the matched mock-infected condition. The 
Velocity Coefficient (VC) quantifies the velocity effect as 
the speed of the transcriptomic host response to move 
from one time point to the next one. Both the MC and VC 
were calculated based on the centroids (arithmetic means) 
of the biological conditions and the transcriptomic dis- 
tances (Euclidian distances) are calculated based on the 
lists of transcripts associated with the kinetics of the host 
response to VN1203 wild-type. Namely the MC and VC of 
a biological condition bc t being the mean vector of gene 
expression levels amongst the biological replicates at the 
time point t are defined as: 

MC(bc t ) = dist(mock, bc t ) and VC(bc t ) = - ^ — t ^— 

1 n 

With: dist(cl, c2) = (cl n c2if 
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where: mock represents the mean vector of gene expres- 
sion levels amongst the matched mock-infected sam- 
ples, At is the amount of time between t and t+1 in 
days, and n is the length of the mean vectors of gene 
expression. 



VN1203-PBlF2del mutant at 10E-3 PFU, and GSE44441 
for the H5N1 VN1203-PBlF2del mutant at 10E-4 PFU. 



Additional files 



Multidimensional scaling reference maps and projections 

Multidimensional Scaling Reference Maps (MDS Ref 
Maps) and projections (MDS Projections) have been 
obtained using a modified version of the SVD-MDS 
method that we developed and used for this study. 
While the original SVD-MDS method presented in [29] 
projects a set of high dimensionality objects into a 2 or 
3 dimensional metric space, the modified SVD-MDS 
method allows to project additional objects over a pre- 
defined MDS representation. The predefined MDS rep- 
resentation is named MDS Reference Map, and the 
overlaid MDS representations are named MDS Projec- 
tions. The original SVD-MDS method performs the di- 
mensional reduction of the objects by using a molecular 
dynamics approach, modeling objects by particles and 
pairwise distances between them by repulsion and at- 
traction forces. The modified SVD-MDS method assigns 
an infinite mass to each object (i.e. particle) of the MDS 
Reference Map, resulting in a projection of the add- 
itional objects over the predefined representation. The 
Kruskal Stress criterion [28] shown in the representa- 
tions quantifies the quality of the representations as a 
fraction of the information lost during the dimensional- 
ity reduction procedure. 

Availability of supporting data 

Mice initial body weight loss, viral titer, viral RNA 
and viral mRNA measurements of the datasets used 
in this study are available on the Systems Virology web- 
site (http://www.systemsvirology.org) on the 'PROJECT 
FOLDERS > Home > Data & Resources > Experimental 
Metadata. The whole compendium of phenotypical 
variables measurements used in this study is available 
under the accession 'ST003\ 

Proteomics processed data are available on the System 
Virology website (http://www.systemsvirology.org) on the 
'PROJECT FOLDERS > Home > Data & Resources > 
Proteomic DataPortal'. The normalized expression matrix 
of the whole compendium of proteomics profiles used in 
this study is available through the accession id ST003. 

Transcriptomic raw data used in this study are 
available via the following accession identifiers: GSE37569 
for the H1N1 CA04 wild-type virus, GSE33263 for 
the H5N1 VN1203 wild-type virus, GSE37572 for the 
H5N1 VN1203-HAavir mutant, GSE43301 for the 
H5N1 VN1203-PB2627E mutant, GSE44445 for the H5N1 
VN1203-NSltrunc mutant, GSE43302 for the H5N1 



Additional file 1: Figure SI. Representation of the collected 
transcriptomic and proteomic profiles of infected mouse lungs along the 
timeframe. We collected 230 transcriptomic and 198 proteomic profiles 
of C57BL76 mouse lung infected by two wild-type (WT) influenza viruses - 
the H1N1 CA04-V\^and the H5N1 VN1203-VW viruses - and 4 mutants of 
the H5N1 VN1203-VWvirus-the H5N1 VN1203-HAavir, H5N1 VN1203- 
PB2627E, H5N1 VN1203-NS1trunc, and H5N1 VN1203-PB1 F2del - at different 
dosage concentrations (10 2 , 10 3 , 10 4 , 10 5 , and 10 6 Plaque-Forming Unit - 
PFU). The transcriptomic and proteomic profiles have been obtained at 
different days post-infection (1, 2, 4, and 7 days post-infection - dpi). Each 
dot in the representation represents a transcriptomic profile and each square 
represents a proteomic profile of a mouse lung sample along the time frame 
for the different viruses and infection concentrations. Samples of this dataset 
are gathered into 51 transcriptomic biological conditions and 42 proteomic 
biological conditions (i.e. set of biological replicates infected by the same 
virus, at the same infection concentration, and from the same time point 
post-infection), 13 transcriptomic and 11 proteomic dosage conditions 
(i.e. set of biological replicates infected by the same virus, and at the same 
infection concentration), and 6 strain conditions (i.e. set of biological replicate 
infected by the same virus). Time-matched mock-infected transcriptomic and 
proteomic profiles have also been collected leading to a total number of 300 
transcriptomic and 266 proteomic profiles in our dataset. 

Additional file 2: Figure S2. Statistical comparison of titer, mRNA and 
genomic RNA. Graphical table showing the statistical comparisons of viral 
titer, viral mRNA and viral genomic RNA measurements between all the 
infection conditions and at all the time points post-infection. The 
Student's t-test has been used to asset the statistical differences between 
the groups for all the different kind of measurements. Comparisons 
having a p-value less than 0.05 have been colored in red. 

Additional file 3: Table SI. List of transcripts correlating with at least 
one eigentranscript. For each transcript correlating with at least one 
eigentranscript identified in the kinetics of host response to H5N1 VN1203 
wild-type, the probe identifier, the associated gene name, the associated 
eigentranscript, and the Pearson's coefficient of correlation are indicated. 

Additional file 4: Table S2. List of proteins correlating with at least one 
eigenproteins. For each protein correlating with at least one eigenprotein 
identified in the kinetics of host response to H5N1 VN1203 wild-type, the 
UniProt/SwissProt accession, the associated gene name, the associated 
eigenprotein, and the Pearson's coefficient of correlation are indicated. 

Additional file 5: Table S3. Magnitude and Velocity Coefficients and 
the phenotypical variables for each transcriptomic biological condition. 
For each transcriptomic biological condition, the Magnitude Coefficient 
(MC) - quantifying the Euclidian distance to the matched mock-infected 
condition - and the Velocity Coefficient (VC) - quantifying the speed of 
the host response to move to the next time point - are indicated as well 
as the mean of mice body weight loss (mean body weight) and the 
difference of the means of viral titer (A mean viral titer), viral mRNA 
(A mean viral mRNA), and mean viral genomic RNA (A mean viral gRNA) 
between two consecutive time-points. 

Additional file 6: Figure S3. Heatmap and over-represented pathways 
of the host response to VN1203-V\^~ and VN1203-NS1trunc at 1 dpi. 
(A) Heatmap of the transcript expression signals for the VN1203-V\^T and 
VN1203-NS1trunc infected transcriptomic profiles at 1 day-post-infection 
and for the 10 4 PFU inoculation dosage. Values are shown as log2 ratioed 
to the mocks infected samples. The dendrogramm representing 
transcripts clustering has been constructed using the Euclidian metric 
and complete-link clustering method. (B) Functional enrichment on the 
800 transcripts found as over-regulated in VN1203-NS1trunc. 104 
significantly over-represented pathways have been identified (p-value 
cutoff of 0.05) and the top 45 is indicated. For each top over-represented 
canonical pathway the p-value (shown as -loglO) is indicated. 
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Additional file 7: Figure S4. MDS Projection of transcriptomic profiles 
of the CA04 wild-type infected samples over the VN1203 wild-type MDS 
Reference Map. (A) Multidimensional Scaling Projection (MDS Projection) 
of the transcriptomic profiles of the 10 3 PFU, 10 4 PFU, 10 5 PFU and 10 6 
PFU H1N1 CA04 wild-type dosage conditions over the H5N1 VN1203 
wild-type Multidimensional Scaling Reference Map (MDS Reference Map). 
Each dot in the representations is the transcriptomic profile of a 
biological sample plotted in the intensity space of gene expression. 
Pairwise distances between the dots are proportional to the 
transcriptomic distances between the samples. MDS Projections allow to 
project -omics profiles over a predefined Multidimensional Scaling (MDS) 
representation. Euclidian distances have been calculated based on the 
signature of transcripts that significantly correlate with one 
eigentranscript. Dots are colored in order to indicate the dosage 
conditions, and biological conditions are indicated by the convex hull of 
the set of biological replicates (i.e. the smallest convex set containing the 
points [30]) and labeled to indicate the time point post-infection. Samples 
and biological conditions of the H5N1 VN1203 wild-type 10 4 PFU infection 
dosage are indicated by gray dots and gray convex hulls. Hence the grey 
spots that are connected represent the transcriptomic profiles of mice 
lung infected by the VN1203 wild-type virus at 10 4 PFU, while the ones 
not connected represent the transcriptomic profiles for the other infection 
concentrations. The Kruskal Stress shown in each representation quantifies 
the quality of the geometrical representation as a fraction of the 
information lost during the dimensionality reduction procedure. 

Additional file 8: Figure S5. Lung homogeneity. (A) Schematic 
representation of a lung with the different lobes used for the experiments 
and measurements. (B) Barplot representation of the viral titer measurements 
showing that there is no difference in the different lung lobes. 
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